# Interpretable NICM ML Model
In this project, we implemented an explainable ML model (based on XGBoost) to gain insight into the association between CMR imaging markers and two adverse outcomes: cardiovascular (CV) hospitalization and all-cause death.
SHAP analysis was used to quantify the contribution of each risk predictor to the model's predictions.

## Requirements: ##

- [Python 3.5](https://www.python.org/downloads/release/python-352/)

- [Anaconda package manager](https://www.anaconda.com/)

## Setup: ##
Create the environment from the attached yml file:

    conda env create -f environment.yml

Activate the conda environment:

    conda activate my_env

## Execution: ##

1. Select the input data by updating the file paths in hyperparams.py

2. Run train.py

## File Descriptions: ##

- hyperparams.py - Specifies file paths and sheet names for training.

- load_data.py - Loads the data from the Excel sheet.

- train.py - Trains a model using grid search.

## Hyperparameter Hints: ##
### To optimize the hyperparameters
In hyperparams.py, set use_grid_search = True
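
For readers unfamiliar with grid search, the sketch below shows the general idea: every combination in a parameter grid is scored and the best one is kept. It is only an illustration. The grid values, the `toy_score` function, and the `grid_search` helper are hypothetical and are not taken from train.py, which would instead score each combination by cross-validating an XGBoost model.

```python
from itertools import product

# Hypothetical grid; the values actually searched in train.py may differ.
param_grid = {
    "max_depth": [2, 3, 4],
    "learning_rate": [0.01, 0.1],
    "n_estimators": [100, 200],
}


def toy_score(params):
    """Stand-in for a cross-validated metric (higher is better)."""
    return -abs(params["max_depth"] - 3) - abs(params["learning_rate"] - 0.1)


def grid_search(grid, score_fn):
    """Return (best_params, best_score) over all combinations in the grid."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        # Strict comparison: ties keep the first combination found.
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = grid_search(param_grid, toy_score)
print(best_params, best_score)
```

The cost grows with the product of the list lengths (12 combinations here), so each value added to the grid multiplies the training time.
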
### To test the effect of reducing the event rate
In hyperparams.py, set num_pat_excluded = 3 to exclude 3 positive patients.
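
The following sketch illustrates one way that excluding positive patients lowers the event rate. It assumes the excluded patients are drawn at random from those with an event; the selection rule in this repository may differ. The helper name `exclude_positive_patients` and the toy labels are hypothetical.

```python
import random


def exclude_positive_patients(labels, num_pat_excluded, seed=0):
    """Return the indices kept after dropping some positive patients.

    labels: sequence of 0/1 outcomes (1 = event).
    num_pat_excluded: number of positive patients to drop.
    Assumption: dropped patients are chosen at random among the positives.
    """
    positives = [i for i, y in enumerate(labels) if y == 1]
    if num_pat_excluded > len(positives):
        raise ValueError("cannot exclude more positive patients than exist")
    rng = random.Random(seed)
    dropped = set(rng.sample(positives, num_pat_excluded))
    return [i for i in range(len(labels)) if i not in dropped]


labels = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]  # toy cohort: 4 events in 10 patients
kept = exclude_positive_patients(labels, num_pat_excluded=3)
event_rate = sum(labels[i] for i in kept) / float(len(kept))
print("event rate: {:.2f} -> {:.2f}".format(
    sum(labels) / float(len(labels)), event_rate))
```

In this toy cohort the event rate falls from 4/10 to 1/7, since three of the four positive patients are removed and all negative patients are kept.
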
### To change the number of cross-validations
#### 1) Determining the optimal model:
In train.py, change num_folds = 10 to any other value.
#### 2) Testing the reproducibility of the model:
In train.py, change N = 1 to any other value.

Note: the main run reported in the manuscript was produced with N = 1.
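
As a rough illustration of how the two settings could interact, the sketch below generates train/test index splits for N repetitions of num_folds-fold cross-validation. It assumes N is the number of repetitions, each with a different shuffle, and it uses plain (unstratified) folds; train.py may stratify by outcome or differ in other ways. The function name `repeated_kfold_indices` is hypothetical.

```python
import random


def repeated_kfold_indices(num_samples, num_folds=10, N=1, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for N repeats of k-fold CV."""
    for repeat in range(N):
        rng = random.Random(seed + repeat)  # a different shuffle per repeat
        indices = list(range(num_samples))
        rng.shuffle(indices)
        # Fold f takes every num_folds-th shuffled index, starting at f.
        folds = [indices[f::num_folds] for f in range(num_folds)]
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds)
                         if g != f for i in fold]
            yield repeat, f, train_idx, test_idx


splits = list(repeated_kfold_indices(num_samples=50, num_folds=10, N=3))
print(len(splits))  # N * num_folds train/test splits
```

With N = 1 this reduces to a single pass of num_folds-fold cross-validation. Larger N repeats the procedure on different shuffles, which is one way to check how much the results vary from run to run.
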
